Learning Tractable Probabilistic Models for Fault Localization
Abstract
In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models, and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs) that can be learned from data, and probabilistically infer the location of the bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs, and learn to identify recurring patterns of bugs. Widely-used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over buggy indicator variables for each line. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or using TARANTULA directly.
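The abstract contrasts TFLMs with TARANTULA, which scores each line in isolation. As a rough illustration of that per-line baseline only (not of the TFLM or its Relational SPN), the sketch below computes the commonly cited Tarantula suspiciousness ratio: the fraction of failing tests that cover a line, divided by the sum of that fraction and the fraction of passing tests that cover it. The function name, the coverage representation (one set of covered line numbers per test), and the toy data are assumptions made for illustration; they do not come from the paper.

```python
from typing import Dict, List, Set


def tarantula_scores(
    coverage: List[Set[int]], passed: List[bool], num_lines: int
) -> Dict[int, float]:
    """Per-line Tarantula suspiciousness (hypothetical helper).

    coverage[i] is the set of line numbers executed by test i,
    passed[i] is True if test i passed.
    """
    total_passed = sum(passed)
    total_failed = len(passed) - total_passed
    scores: Dict[int, float] = {}
    for line in range(1, num_lines + 1):
        # Count passing and failing tests that execute this line.
        p = sum(1 for cov, ok in zip(coverage, passed) if ok and line in cov)
        f = sum(1 for cov, ok in zip(coverage, passed) if not ok and line in cov)
        pass_ratio = p / total_passed if total_passed else 0.0
        fail_ratio = f / total_failed if total_failed else 0.0
        denom = pass_ratio + fail_ratio
        # Lines never executed by any test get a score of 0.0 by convention here.
        scores[line] = fail_ratio / denom if denom else 0.0
    return scores


# Toy data (assumed): two passing tests and one failing test over three lines.
coverage = [{1, 2}, {1}, {1, 3}]
passed = [True, True, False]
scores = tarantula_scores(coverage, passed, num_lines=3)
```

Each score depends only on the counts for its own line, which is the independence the abstract refers to. A TFLM, by contrast, could take such scores as input features while modelling the lines jointly.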
Similar resources
Fault Localization for Java Programs using Probabilistic Program Dependence Graph
Fault localization is the process of finding the location of faults. It determines the root cause of a failure, identifies the causes of a faulty program's abnormal behaviour, and pinpoints exactly where the bugs are. Existing fault localization techniques include slice-based, program-spectrum-based, statistics-based, program-state-based, machine learning bas...
Learning and Exploiting Relational Structure for Efficient Inference
Aniruddh Nath. Chair of the Supervisory Committee: Professor Pedro Domingos, Computer Science & Engineering. One of the central challenges of statistical relational learning is the tradeoff between expressiveness and computational tractability. Representations such as Markov logic can capture rich joint probabilistic models over ...
The Bayesian Network based program dependence graph and its application to fault localization
Fault localization is an important and expensive task in software debugging. Some probabilistic graphical models, such as the probabilistic program dependence graph (PPDG), have been used in fault localization. However, the PPDG cannot reason across nonadjacent nodes and only supports inference about local anomalies. In this paper, we propose a novel probabilistic graphical model called B...
Active Fault Diagnosis for Nonlinear Systems with Probabilistic Uncertainties
Stringent requirements on safety and availability of high-performance systems necessitate reliable fault detection and isolation in the event of system failures. This paper investigates active fault diagnosis of nonlinear systems with probabilistic, time-invariant uncertainties of the parameters and initial conditions. A probabilistic model-based approach is presented for the design of auxiliar...
A Generic Approach for Robust Probabilistic Estimation with Graphical Models
Probabilistic estimation using graphical models plays an important role in today’s intelligent and autonomous systems. This paper summarizes our work on robust probabilistic estimation using such models. This robustness, i.e., algorithmic fault tolerance in the presence of outliers, is crucial for any autonomous system aiming at long-term operation. We show how probabilistic estimation using ...